Goto

Collaborating Authors

 kernel shap



Explaining Deep Learning-based Anomaly Detection in Energy Consumption Data by Focusing on Contextually Relevant Data

Noorchenarboo, Mohammad, Grolinger, Katarina

arXiv.org Artificial Intelligence

Detecting anomalies in energy consumption data is crucial for identifying energy waste, equipment malfunction, and overall, for ensuring efficient energy management. Machine learning, and specifically deep learning approaches, have been greatly successful in anomaly detection; however, they are black-box approaches that do not provide transparency or explanations. SHAP and its variants have been proposed to explain these models, but they suffer from high computational complexity (SHAP) or instability and inconsistency (e.g., Kernel SHAP). To address these challenges, this paper proposes an explainability approach for anomalies in energy consumption data that focuses on context-relevant information. The proposed approach leverages existing explainability techniques, focusing on SHAP variants, together with global feature importance and weighted cosine similarity to select background dataset based on the context of each anomaly point. By focusing on the context and most relevant features, this approach mitigates the instability of explainability algorithms. Experimental results across 10 different machine learning models, five datasets, and five XAI techniques, demonstrate that our method reduces the variability of explanations providing consistent explanations. Statistical analyses confirm the robustness of our approach, showing an average reduction in variability of approximately 38% across multiple datasets.


Reviews: A Unified Approach to Interpreting Model Predictions

Neural Information Processing Systems

The authors show that several methods in the literature used for explaining individual model predictions fall into the category of "additive feature attribution" methods. They proposes a new kind of additive feature attribution method based on the concept of Shapely values and call the resulting explanations the SHAP values. The authors also suggest a new kernel called the shapely kernel which can be used to compute SHAP values via linear regression (a method they call kernel SHAP). They discuss how other methods, such as DeepLIFT, can be improved by better approximating the Shapely values. Summary of review: Positives: (1) Novel and sound theoretical framework for approaching the question of model explanations, which has been very lacking in the field (most other methods were developed ad-hoc).



Provably Accurate Shapley Value Estimation via Leverage Score Sampling

Musco, Christopher, Witter, R. Teal

arXiv.org Artificial Intelligence

Originally introduced in game theory, Shapley values have emerged as a central tool in explainable machine learning, where they are used to attribute model predictions to specific input features. However, computing Shapley values exactly is expensive: for a general model with $n$ features, $O(2^n)$ model evaluations are necessary. To address this issue, approximation algorithms are widely used. One of the most popular is the Kernel SHAP algorithm, which is model agnostic and remarkably effective in practice. However, to the best of our knowledge, Kernel SHAP has no strong non-asymptotic complexity guarantees. We address this issue by introducing Leverage SHAP, a light-weight modification of Kernel SHAP that provides provably accurate Shapley value estimates with just $O(n\log n)$ model evaluations. Our approach takes advantage of a connection between Shapley value estimation and agnostic active learning by employing leverage score sampling, a powerful regression tool. Beyond theoretical guarantees, we show that Leverage SHAP consistently outperforms even the highly optimized implementation of Kernel SHAP available in the ubiquitous SHAP library [Lundberg & Lee, 2017].


Interpretable breast cancer classification using CNNs on mammographic images

Balve, Ann-Kristin, Hendrix, Peter

arXiv.org Artificial Intelligence

Deep learning models have achieved promising results in breast cancer classification, yet their 'black-box' nature raises interpretability concerns. This research addresses the crucial need to gain insights into the decision-making process of convolutional neural networks (CNNs) for mammogram classification, specifically focusing on the underlying reasons for the CNN's predictions of breast cancer. For CNNs trained on the Mammographic Image Analysis Society (MIAS) dataset, we compared the post-hoc interpretability techniques LIME, Grad-CAM, and Kernel SHAP in terms of explanatory depth and computational efficiency. The results of this analysis indicate that Grad-CAM, in particular, provides comprehensive insights into the behavior of the CNN, revealing distinctive patterns in normal, benign, and malignant breast tissue. We discuss the implications of the current findings for the use of machine learning models and interpretation techniques in clinical practice.


Progressive Inference: Explaining Decoder-Only Sequence Classification Models Using Intermediate Predictions

Kariyappa, Sanjay, Lécué, Freddy, Mishra, Saumitra, Pond, Christopher, Magazzeni, Daniele, Veloso, Manuela

arXiv.org Machine Learning

This paper proposes Progressive Inference - a framework to compute input attributions to explain the predictions of decoder-only sequence classification models. Our work is based on the insight that the classification head of a decoder-only Transformer model can be used to make intermediate predictions by evaluating them at different points in the input sequence. Due to the causal attention mechanism, these intermediate predictions only depend on the tokens seen before the inference point, allowing us to obtain the model's prediction on a masked input sub-sequence, with negligible computational overheads. We develop two methods to provide sub-sequence level attributions using this insight. First, we propose Single Pass-Progressive Inference (SP-PI), which computes attributions by taking the difference between consecutive intermediate predictions. Second, we exploit a connection with Kernel SHAP to develop Multi Pass-Progressive Inference (MP-PI). MP-PI uses intermediate predictions from multiple masked versions of the input to compute higher quality attributions. Our studies on a diverse set of models trained on text classification tasks show that SP-PI and MP-PI provide significantly better attributions compared to prior work.


Interpretable Machine Learning for TabPFN

Rundel, David, Kobialka, Julius, von Crailsheim, Constantin, Feurer, Matthias, Nagler, Thomas, Rügamer, David

arXiv.org Machine Learning

The recently developed Prior-Data Fitted Networks (PFNs) have shown very promising results for applications in low-data regimes. The TabPFN model, a special case of PFNs for tabular data, is able to achieve state-of-the-art performance on a variety of classification tasks while producing posterior predictive distributions in mere seconds by in-context learning without the need for learning parameters or hyperparameter tuning. This makes TabPFN a very attractive option for a wide range of domain applications. However, a major drawback of the method is its lack of interpretability. Therefore, we propose several adaptations of popular interpretability methods that we specifically design for TabPFN. By taking advantage of the unique properties of the model, our adaptations allow for more efficient computations than existing implementations. In particular, we show how in-context learning facilitates the estimation of Shapley values by avoiding approximate retraining and enables the use of Leave-One-Covariate-Out (LOCO) even when working with large-scale Transformers. In addition, we demonstrate how data valuation methods can be used to address scalability challenges of TabPFN.


Shaping Up SHAP: Enhancing Stability through Layer-Wise Neighbor Selection

Kelodjou, Gwladys, Rozé, Laurence, Masson, Véronique, Galárraga, Luis, Gaudel, Romaric, Tchuente, Maurice, Termier, Alexandre

arXiv.org Artificial Intelligence

Machine learning techniques, such as deep learning and ensemble methods, are widely used in various domains due to their ability to handle complex real-world tasks. However, their black-box nature has raised multiple concerns about the fairness, trustworthiness, and transparency of computer-assisted decision-making. This has led to the emergence of local post-hoc explainability methods, which offer explanations for individual decisions made by black-box algorithms. Among these methods, Kernel SHAP is widely used due to its model-agnostic nature and its well-founded theoretical framework. Despite these strengths, Kernel SHAP suffers from high instability: different executions of the method with the same inputs can lead to significantly different explanations, which diminishes the utility of post-hoc explainability. The contribution of this paper is two-fold. On the one hand, we show that Kernel SHAP's instability is caused by its stochastic neighbor selection procedure, which we adapt to achieve full stability without compromising explanation fidelity. On the other hand, we show that by restricting the neighbors generation to perturbations of size 1 -- which we call the coalitions of Layer 1 -- we obtain a novel feature-attribution method that is fully stable, efficient to compute, and still meaningful.


WebSHAP: Towards Explaining Any Machine Learning Models Anywhere

Wang, Zijie J., Chau, Duen Horng

arXiv.org Artificial Intelligence

As machine learning (ML) is increasingly integrated into our everyday Web experience, there is a call for transparent and explainable web-based ML. However, existing explainability techniques often require dedicated backend servers, which limit their usefulness as the Web community moves toward in-browser ML for lower latency and greater privacy. To address the pressing need for a client-side explainability solution, we present WebSHAP, the first in-browser tool that adapts the state-of-the-art model-agnostic explainability technique SHAP to the Web environment. Our open-source tool is developed with modern Web technologies such as WebGL that leverage client-side hardware capabilities and make it easy to integrate into existing Web ML applications. We demonstrate WebSHAP in a usage scenario of explaining ML-based loan approval decisions to loan applicants. Reflecting on our work, we discuss the opportunities and challenges for future research on transparent Web ML. WebSHAP is available at https://github.com/poloclub/webshap.